Skip to content

Conversation

@thomasdarimont
Copy link
Contributor

@thomasdarimont thomasdarimont commented Oct 21, 2023

  • Update maven-plugins in pom.xml
  • Update dependency versions in pom.xml
  • Remove model-assert (test) dependency to avoid using libraries with knwon CVEs
  • Replaced model-assert with assertj+gson
  • Removed unused imports
  • Made pointer field in ExtismCurrentPlugin private final
  • Made handle field in CancelHandle private final and exposed the cancel method (since it is a public type)

Update maven-plugins in pom.xml
Update dependency versions in pom.xml
Remove model-assert (test) dependency to avoid using libraries with knwon CVEs
Replaced model-assert with assertj+gson
Removed unused imports
Made pointer field in ExtismCurrentPlugin private final
Made handle field in CancelHandle private final and exposed the cancel method (since it is a public type)
- Instead of manually writing to memory, we now use plugin.returnString(...) in the host function tests
- Removed warnings from the tests
- Moved called instance-field in HostFunction to local variable
- Made fields un HostFunction private and prefer exposing read-only data
- Aling visibility of operatons in HostFunction
@thomasdarimont
Copy link
Contributor Author

thomasdarimont commented Oct 22, 2023

@bhelx I gave this another round of refactoring:

refactor: Revise PluginTests and used API

  • Instead of manually writing to memory, we now use plugin.returnString(...) in the host function tests
  • Removed warnings from the tests
  • Moved callback instance-field in HostFunction to local variable
  • Made fields un HostFunction private and prefer to expose read-only data
  • Align visibility of operations in HostFunction

refactor: Move Logging functionality to support package

  • Move deprecated log file configuration to dedicated support class
    Cleans up the Extism API surface, as deprecated methods are now moved to Logging support class.
  • Add log levels "off" and "error"
    Rust defines the additional log levels "off" and "error".

refactor: Prevent instantiation of support classes

refactor: Improve developer experience

  • Add immutable builder style API to Manifest
  • Removed Manifest#addSource in favour of Manifest#withAdditionalSource
  • Added convenience to Plugin

refactor: Add convenience methods to Extism for creating Manifests

  • Adjust tests to use the convenience methods

refactor: Simplify usage of UserData with HostFunction

  • Add overloaded constructor to HostFunction
    We can now pass UserData directly to the HostFunction constructor which simplifies it's usage.

@thomasdarimont
Copy link
Contributor Author

thomasdarimont commented Oct 22, 2023

@bhelx I think the API could be improve quite a bit.

Naming of ExtismFunction / HostFunction

Currently the naming of ExtismFunction and HostFunction. It seems that "HostFunction" is in fact a wrapper/container for the actual host function which is then expressed as ExtismFunction.

How about changing the names in the following way:
ExtismFunction -> HostFunctionCallback
HostFunction -> HostFunction

HostFunction state and userdata

Another problem I see with HostFunction is that it (currently) transports a lot of baggage that's publicly accessible. In my refactoring I made the fields private and moved the callback as a local variable. However, the passed in userdata is still part of the HostFunction but that feels wrong to me. I think Userdata should be remove from the HostFunction and should rather be passed when calling the plugin, e.g. plugin.call(functionName, "someInput", myUserData);

Is user data meant to be custom metadata that could be added to a host function registration, or is this rather "per" call additional (typed) input?

If UserData should remain in HostFunction, I'd remove Optional parameter from the constructor and add 2 overloads, one with a UserData parameter and one without. This makes the call-sites simpler.

API quite low-level

In general the whole interface seems to be a bit too low-level, passing around pointers and LibExtism.ExtismValType's is quite cumbersome and error-prone. Perhaps a better abstraction could hide this internals.
If users really have to work with the param and return types themselves, it might be cool to move the types LibExtism.ExtismValType, LibExtism.ExtismVal and LibExtism.ExtismValUnion as public static classes to Extism, e.g.
Extism.ValType, Extism.Val, Extism.ValUnion.

Plugin lifecycle / reuse

Another thing that's still unclear to me is how users would initialize the plugin in their applications:

        try (var plugin = new Plugin(manifest, true, null)) {
            var output = plugin.call(functionName, "someInput");
		   processOutput(output);
        }

Would they really create a new `Plugin instance per call, or would you rather create an instance per call?
If the later is the case, then I think that the manifest will be parsed over and over again for each call, which can take some time. I think this could be improved a bit, since the wasm code will probably not change, in most applications, across plugin calls. So perhaps something like a PluginManager could (lazily) load / instanciate plugins once and (or on request) and use a "cached" and "prepared" plugin for the actual call. On explicit request or when the application shuts down the PluginManager could destroy / close() the created plugin instances.
Are plugin instances intended to be shared or does every caller really need a dedicated plugin instance?

Logging integration

Currently, the deprecated org.extism.sdk.Extism#setLogFile is used to configure a log file through org.extism.sdk.LibExtism#extism_log_file. I think it doesn't look good to ship already deprecated functionality with a 1.0 release. How about providing a way to configure a logging callback in extism, which is called whenever a log line is generated in the library internally? With this in place, library integrations could register a logging callback as and adapt the log request to the logging facilities of the respective platform.

Remove unused components

Currently ManifestHttpRequest and org.extism.sdk.manifest.Manifest#allowedHosts are not used and could be removed. The functionality could be added later.

What do you think?

- Move deprecated log file configuration to dedicated support class
Cleans up the Extism API surface, as deprecated methods are now moved to Logging support class.
- Add log levels "off" and "error"
Rust defines the additional log levels "off" and "error".
@thomasdarimont
Copy link
Contributor Author

thomasdarimont commented Oct 22, 2023

I also just tried the java host function example from the documentation, but it seems to be broken / outdated:
https://extism.org/docs/integrate-into-your-codebase/java-host-sdk

The following host example would work (based on the current refactorings):

import org.extism.sdk.Extism;
import org.extism.sdk.ExtismFunction;
import org.extism.sdk.HostFunction;
import org.extism.sdk.HostUserData;
import org.extism.sdk.LibExtism.ExtismValType;
import org.extism.sdk.Plugin;

import java.nio.file.Path;

public class AppHostFuncs
{

  // we can create a custom type to pass complex, or compound, data 
  // through a host function. You could imagine this may hold a reference
  // to a database client or some other Java objects needed by the host functions
  public static class MyUserData extends HostUserData {

    private final String data1;

    private final int data2;

    public MyUserData(String data1, int data2) {
      this.data1 = data1;
      this.data2 = data2;
    }
  }

  // To create the host function we need a callback in Java world
  // and an ExtismFunction that references it
  public static HostFunction<?>[] getHostFunctions() {

    ExtismFunction<MyUserData> callbackFunc = (plugin, params, returns, data) -> {

      System.out.println("Hello from Java Host Function!");

      String wasmOutput = plugin.inputString(params[0]);
      System.out.printf("Input string received from plugin, %s%n", wasmOutput);

      data.ifPresent(d -> {
        System.out.printf("Host user data, %s, %d%n", d.data1, d.data2);

        // here you could deserialize the json into a java object again and process
        // with the help of the supplied user dataparams

        plugin.returnString(returns[0], "{ \"output\": " + wasmOutput + "}");
      });
    };

    var parametersTypes = new ExtismValType[]{ExtismValType.I64};
    var resultsTypes = new ExtismValType[]{ExtismValType.I64};

    var helloWorld = new HostFunction<>(
            "hello_world",
            parametersTypes,
            resultsTypes,
            callbackFunc,
            new MyUserData("test", 2)
    );

    return new HostFunction[]{helloWorld};
  }

  public static void main(String[] args) {

    var manifest = Extism.manifestFromPath(Path.of("code-functions.wasm"));
    var functions = getHostFunctions();

    // we must pass any host functions we created to the plugin constructor.
    // it will import these functions so the plugin can call them.
    try (var plugin = new Plugin(manifest, true, functions)) {
      var output = plugin.call("count_vowels", "Hello World");
      System.out.println(output);
    }
  }
}

The "simple" hello world looks like this with the refactorings:

import org.extism.sdk.Extism;
import org.extism.sdk.Plugin;

import java.nio.file.Path;

public class App 
{
    public static void main(String[] args)
    {
        var manifest = Extism.manifestFromPath(Path.of("code.wasm"));

        // NOTE: if you encounter an error such as: 
        // "Unable to load plugin: unknown import: wasi_snapshot_preview1::fd_write has not been defined"
        // change `false` to `true` in the following function to provide WASI imports to your plugin.
        try (var plugin = new Plugin(manifest, false)) 
        {
            var output = plugin.call("count_vowels", "Hello World");
            System.out.println(output);
        }
    }
}

The later almost becomes a one-liner:

import org.extism.sdk.Extism;
import org.extism.sdk.Plugin;

import java.nio.file.Path;

import static org.extism.sdk.Extism.manifestFromPath;

public class AppSlim {
    public static void main(String[] args) {
        var output = Extism.invokeFunction(manifestFromPath(Path.of("code.wasm")), "count_vowels", "Hello World");
        System.out.println(output);
    }
}

- Add immutable builder style API to Manifest
- Removed Manifest#addSource in favour of Manifest#withAdditionalSource
- Added convenience to Plugin
- Adjust tests to use the convenience methods
We can now pass UserData directly to the HostFunction constructor which simplifies it's usage.
@bhelx
Copy link
Contributor

bhelx commented Oct 22, 2023

This all looks great! I'm going to have to read it all and review on Monday, but overall I'm supportive of these changes.

func.invoke(currentPlugin, inputValues, outputValues, Optional.ofNullable(userData));

for (ExtismVal output : outputValues) {
convertOutput(output, output);
Copy link
Contributor Author

@thomasdarimont thomasdarimont Oct 22, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

convertOutput(output, output); looks wrong (no-op?), but is like that also in the current implementation here: https://github.com/extism/java-sdk/blob/main/src/main/java/org/extism/sdk/HostFunction.java#L46

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm yeah seems like a mistake for these to both be output

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I take another look at other implementations. If it turns out that this conversion is not necessary, then I remove it.

Copy link
Contributor Author

@thomasdarimont thomasdarimont Oct 23, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the original convertOutput method, called via convertOutput(output, output);

    void convertOutput(LibExtism.ExtismVal original, LibExtism.ExtismVal fromHostFunction) {
        if (fromHostFunction.t != original.t)
            throw new ExtismException(String.format("Output type mismatch, got %d but expected %d", fromHostFunction.t, original.t));

        if (fromHostFunction.t == LibExtism.ExtismValType.I32.v) {
            original.v.setType(Integer.TYPE);
            original.v.i32 = fromHostFunction.v.i32;
        } else if (fromHostFunction.t == LibExtism.ExtismValType.I64.v) {
            original.v.setType(Long.TYPE);
            original.v.i64 = fromHostFunction.v.i64;
        } else if (fromHostFunction.t == LibExtism.ExtismValType.F32.v) {
            original.v.setType(Float.TYPE);
            original.v.f32 = fromHostFunction.v.f32;
        } else if (fromHostFunction.t == LibExtism.ExtismValType.F64.v) {
            original.v.setType(Double.TYPE);
            original.v.f64 = fromHostFunction.v.f64;
        } else
            throw new ExtismException(String.format("Unsupported return type: %s", original.t));
    }

can be converted to:

    private void coerceType(ExtismVal value) {

        if (value.t == LibExtism.ExtismValType.I32.v) {
            value.v.setType(Integer.TYPE);
        } else if (value.t == LibExtism.ExtismValType.I64.v) {
            value.v.setType(Long.TYPE);
        } else if (value.t == LibExtism.ExtismValType.F32.v) {
            value.v.setType(Float.TYPE);
        } else if (value.t == LibExtism.ExtismValType.F64.v) {
            value.v.setType(Double.TYPE);
        } else {
            throw new ExtismException(String.format("Unsupported return type: %s", value.t));
        }
    }

with call coerceType(output);.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah makes sense. yeah the setType seems to be the only real mutation!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to do the same with the inputValues?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just adapted the code for the type coercion of the output parameters.

if (fromHostFunction.t != original.t)
throw new ExtismException(String.format("Output type mismatch, got %d but expected %d", fromHostFunction.t, original.t));
private InternalExtismFunction createCallbackFunction(ExtismFunction<T> func, T userData) {
return (Pointer pluginPointer, ExtismVal inputs, int nInputs, ExtismVal outputs, int nOutputs, Pointer data) -> {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the data pointer is not used here, is wrong?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, we probably don't need Extism to keep track of this pointer since we're using a closure and Java keeps the userData in scope. The pointer is for languages or functions that can't do this and they need to look back up the object from a pointer.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we then leave out the data parameter on the interface?
If not, then we should document that the data parameter is not used here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suspect we can leave it out, yes. If not then we should mark it as unused.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The tests worked fine with this interface definition

interface InternalExtismFunction extends Callback {

        /**
         * Host function implementation.
         */
        void invoke(Pointer plugin, ExtismVal inputs, int nInputs, ExtismVal outputs, int nOutputs
        //         , Pointer data
        );
    }

And this callback function:

 private InternalExtismFunction createCallbackFunction(ExtismFunction<T> func, T userData) {
        return (Pointer pluginPointer, ExtismVal inputs, int nInputs, ExtismVal outputs, int nOutputs
        //         , Pointer data
        ) -> {

            var outputValues = (ExtismVal[]) outputs.toArray(nOutputs);
            var inputValues = (ExtismVal[]) inputs.toArray(nInputs);
            var currentPlugin = new ExtismCurrentPlugin(pluginPointer);

            func.invoke(currentPlugin, inputValues, outputValues, Optional.ofNullable(userData));

            for (ExtismVal output : outputValues) {
                coerceType(output);
            }
        };
    }
    ```

@thomasdarimont
Copy link
Contributor Author

Closing this outdated PR for now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants